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Abstract 

Background: Toll-like receptors (Tlrs) are major molecular pattern recognition receptors of the innate immune 
system. Atlantic cod {Gadus morhua) is the first vertebrate known to have lost most of the mammalian Tlr 
orthologues, particularly all bacterial recognising and other cell surface Tlrs. On the other hand, its genome encodes 
a unique repertoire of teleost-specific Tlrs. The aim of this study was to investigate if these duplicate Tlrs have been 
retained through adaptive evolution to compensate for the lack of other cell surface Tlrs in the cod genome. 

Results: In this study, one tlr21, 12 tlr22 and two tlr23 genes representing the teleost-specific Tlr family have been 
cloned and characterised in cod. Phylogenetic analysis grouped all tlr22 genes under a single clade, indicating that 
the multiple cod paralogues have arisen through lineage-specific duplications. All tlrs examined were transcribed in 
immune-related tissues as well as in stomach, gut and gonads of adult cod and were differentially expressed during 
early development. These tlrs were also differentially regulated following immune challenge by immersion with 
Vibrio anguillarum, indicating their role in the immune response. An increase in water temperature from 4 to 12°C 
was associated with a 5.5-fold down-regulation of tlr22d transcript levels in spleen. Maximum likelihood analysis 
with different evolution models revealed that tlr22 genes are under positive selection. A total of 24 codons were 
found to be positively selected, of which 19 are in the ligand binding region of ectodomain. 

Conclusion: Positive selection pressure coupled with experimental evidence of differential expression strongly 
support the hypothesis that teleost-specific tlr paralogues in cod are undergoing neofunctionalisation and can 
recognise bacterial pathogen-associated molecular patterns to compensate for the lack of other cell surface Tlrs. 

Keywords: Atlantic cod, Toll-like receptors, TLR, Innate immunity, Positive selection, Thermal stress, 
Neofunctionalisation 



Background The TIR domain is highly conserved across all trans mem- 

Toll-like receptors (TLRs) are an integral part of the in- brane TLRs and initiates signal transduction, while the 

nate immune system in all organisms and form one of variable extracellular domain is composed of leucine-rich 

the first lines of defence against invading pathogens. repeats (LRR) motifs that are involved in recognising spe- 

They are a class of pathogen recognition receptors cific PAMPs [2]. To date, 21 different TLRs have been 

(PRRs) that elicit specific responses against pathogens identified across numerous vertebrates [3]. Based on 

upon recognising pathogen-associated molecular pat- phylogenetic analyses, they are organised in six major 

terns (PAMPs) [1]. Most TLRs are type-I transmem- families: TLR1 (TLRs 1, 2, 6, 10 and 14), TLR3, TLR4, 

brane proteins that are composed of three domains: an TLR5, TLR7 (TLRs 7, 8, 9) and TLR11 (TLRs 11 to 13 

intracellular Toll/interleukin-1 receptor (TIR) domain, and TLRs 21 to 23) [3]. Avian, amphibian and teleost 

a transmembrane region and an extracellular domain. genomes encode for most of the mammalian orthologues, 

as well as additional TLRs [4-6]. Tlrl5 has been identified 
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in teleosts. Tlr21, Tlr 22 and Tlr 23 are generally termed as 

Full list of author information is available at the end of the article 'teleost-specific Tlrs; since they are present in several 
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teleost taxa [7]. Nevertheless, a putative Tlr21 has been 
identified in chicken (Gallus gallus) [8], while Tlr21 and 
Tlr22 have been found in Xenopus tropicalis [5]. 

Even though the role of TLRs in detecting pathogens 
is well documented, these molecules are also known to 
be activated by endogenous agonists and to be involved 
in other biological functions. Early studies in Drosophila 
melanogaster have demonstrated that they control the 
formation of the dorso-ventral axis during embryogen- 
esis [9]. Heat shock proteins, inflammatory mediators 
and fragments of molecules from extracellular matrix, 
which are mainly generated in response to stress or as a 
consequence of tissue injury, have the potential to acti- 
vate TLRs [2]. 

In spite of a large degree of conservation between teleost 
TLRs and their mammalian orthologues, there are some 
differences in signalling and their ability to recognise 
PAMPs [4]. Unlike in mammals, there is not always a one 
to one relationship between teleost Tlr families and the 
PAMPs that they recognise. Immunostimulation experi- 
ments have revealed that several teleost TLRs respond to 
PAMPs from bacterial and viral origin [4]. In particular, 
the teleost-specific Tlr22 is known to recognise dsRNA in 
tiger pufferfish, Takifugu rubripes [10], but it also responds 
to other PAMPs from Gram-positive and Gram-negative 
bacteria in other teleosts [4,11-13]. 

The recently published Atlantic cod (Gadus morhua) 
genome draft has uncovered a unique feature of its im- 
mune system: the absence of the genes encoding for major 
histocompatibility complex (MHC) II, CD4 and invariant 
chain, which are key components of the adaptive immune 
system in jawed vertebrates [14]. However, this fish has a 
large number of MHC I genes and a unique repertoire of 
TLR families in its genome. The cod genome encodes four 
of the mammalian homologues (tlr3, tlr7, tlrS and tlr9), 
and all three teleost-specific tlrs (tlr21, tlr22 and tlr23), 
representing three of the six TLR families. It has lost all 
cell surface receptors as well as bacterial recognising 
mammalian homologues from the TLR1, TLR4 and TLR5 
families. A single copy of tlrl4 has been identified in the 
cod genome, but the ligand specificity of this Tlr family 
member is still unknown. 

Gene duplication is a major force of adaptive genome 
evolution, since it allows duplicate genes to explore differ- 
ent aspects of the multidimensional functional space [15]. 
Even if most duplicates degenerate into pseudogenes 
(nonfunctionalisation or pseudogenisation) within 50 mi- 
llion years following the duplication event, a remarkable 
number of gene duplicates are found in vertebrate ge- 
nomes [16]. One of the main mechanisms that account 
for the increased probability of retaining duplicate genes is 
the acquisition of a novel function (neofunctionalisation) 
by one of the copies, which is no longer required to main- 
tain the original functions [17]. An alternative model, 



which is not incompatible with subsequent neofunctiona- 
lisation, is the sharing of ancestral functions between gene 
duplicates (subfunctionalisation), namely partitioning of 
spatio-temporal expression domains [18]. The relative 
contribution of neofunctionalisation and subfunctionali- 
sation in early vertebrate evolution is still a matter of 
controversial debate and little is known about the role of 
adaptive and/ or non-adaptive pressures in the mainten- 
ance of duplicate genes (reviewed in [19]). One of the 
factors that make it difficult to distinguish these pro- 
cesses is the long divergence time, which clouds direct 
tests of selection on ancient evolutionary events. Analyses 
of more recent duplications, such as the ones found in 
some teleost lineages, may prove useful to overcome this 
issue [20]. 

The expanded teleost-specific Tlr family in cod is so 
far unique amongst teleosts and provides a good model 
to better understand how and why so many duplicate 
genes have been retained during vertebrate evolution. It 
is plausible that these multiple teleost-specific paralo- 
gues are retained through adaptive evolution to compen- 
sate for the lack of other cell surface Tlrs in the cod 
genome. To address this hypothesis, we have examined 
the molecular evolution and differential expression of all 
teleost-specific Tlrs present in the current cod genome 
assembly. 

Methods 

Sources of biological samples 

Tissue and embryo samples from naive fish 

Two-year old Atlantic cod (Codfarmers ASA, Norway), 
reared in land based tanks at Morkvedbukta research 
station (University of Nordland, Norway) were used for 
this study. The flow-through rearing system was sup- 
plied with sea water at 7-8°C and the fish were fed daily 
with a commercial diet (Amber Neptun, Skretting AS, 
Stavanger, Norway). Adult fish were humanely killed by 
immersion in an anaesthetic bath containing 0.5 g-L" 1 
tricaine methanesulfonate (Sigma) in accordance with 
the national guidelines detailed in the "Norwegian Regu- 
lation on Animal Experimentation" (Forsoksdyrutvalget, 
Norway). Head-kidney, kidney, spleen, liver, stomach, 
gut, heart, gills, muscle, skin, brain, blood and gonads 
were collected, snap-frozen in liquid nitrogen and stored 
at -80°C for subsequent RNA extraction. 

Cod eggs for this study were kindly provided by Cod- 
farmers ASA (Norway). Unfertilised eggs were immedi- 
ately frozen in liquid nitrogen and stored at -80°C until 
RNA extraction. Eggs from individual cod spawning 
pairs were artificially fertilised in drum-filtered (30 \im) 
UV treated seawater (7°C) and maintained without aer- 
ation at a density of 10 mL-L" 1 . Up to one third of the 
seawater was replaced on a daily basis, so as to keep the 
oxygen concentration above 6.5 mg-L" 1 . Embryos at 
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different developmental stages (4-cells, 16-cells, oblong, 
25% epiboly, 75% epiboly, 10-somite, 30-somite and 
golden eye) and larvae (hatched, bladder stage, hindgut 
stage and first feeding) were observed under an optical 
microscope and approximately 50 specimens from each 
stage were collected, snap-frozen in liquid nitrogen and 
stored at -80°C for further analysis. 

Tissue samples from fish exposed to a bacterial pathogen 

This experiment was conducted at the Institute of Mar- 
ine Research, Norway. One hundred and twenty adult 
fish with an average weight of 60 g were equally distribu- 
ted in three 250 L tanks, which were part of a flow- 
through system that was supplied with sea water at 7-8°C. 
The fish were maintained in this system for a period of 
five weeks prior to the challenge experiment. They were 
fed daily with a fishmeal based feed [21] at 1.5% (w/w) of 
their body weight every day. Prior to bacterial challenge, 
initial control samples were collected from six fish, two 
per tank. Thereafter, the water flow was stopped and 
fish in all three tanks were subjected to bath challenge 
with V. anguillarum strain H610 at a concentration of 
2.6-10 7 cfu-ml" 1 for 1 h [21]. Post-challenge samples were 
collected at 4 (4 hpc) and 48 (48 hpc) h after exposure. 
The samples collected included head-kidney, gills and 
spleen, which were immediately snap-frozen in liquid ni- 
trogen and maintained at -80°C for further analysis. 

Temperature stress 

The temperature stress experiment was conducted at the 
indoor facilities of Morkvedbukta research station. Fifty 
adult cod with a mean weight of 263 ± 50 g were evenly 
distributed in two 500 L tanks and fed daily (Amber 
Neptun, Skretting AS, Norway) to 1.5% (w/w) of their 
body weight. Seawater at 4°C was supplied to the rearing 
tanks and the fish were allowed to acclimatise for a 
period of one week prior to the temperature stress ex- 
periment. Initial control samples were collected at the 
start of the experiment. Water temperature was then 
increased from 4°C to 12°C at a rate of 2°Ch" 1 and the 
first post-stress samples were collected at 4 h (4 hps) 
when the water temperature reached 12°C. Fish were 
further maintained at 12°C and the final sample was col- 
lected after 72 h (72 hps). Three fish were taken from 
each tank at each sampling point (n=6) and humanely 
killed as above. Head-kidney and spleen were immedi- 
ately dissected, snap-frozen in liquid nitrogen and stored 
at -80°C prior to RNA extraction. 

RNA extraction and cDNA synthesis 

The above samples were lysed in Lysing Matrix D (MP 
Biomedicals, USA) and total RNA extracted using QIAzol 
(Qiagen, Netherlands) according to the manufacturers 
instructions. Quality and quantity of total RNA were 



assessed by agarose electrophoresis and spectrophotom- 
etry (NanoDrop, Thermo Scientific, USA), respectively. 
Complementary DNA was synthesised using the Quanti- 
tect reverse transcriptase kit (Qiagen, Netherlands). Total 
RNA was treated with gDNA wipeout buffer provided in 
the reverse transcriptase kit to remove any traces of gen- 
omic DNA. Luciferase mRNA (Promega, USA) was used 
as an external control, as previously reported [22]. 

Cloning of Atlantic cod tlr21, tlr22 and tlr23 genes 

Tlr21, Tlr22 and Tlr23 protein sequences from zebra- 
fish (Danio rerio), stickleback (Gasterosteus aculeatus), 
green-spotted pufferfish (Tetraodon nigroviridis) and 
tiger pufferfish were used as queries in Ensembl BLAST 
searches (www.ensembl.org) against the cod genome 
(gadMorl v67.1). In order to predict gene sequences, 
contigs and scaffolds, the above BLAST hits were fur- 
ther analysed using the AUGUSTUS gene prediction 
server at University of Greifswald [23]. Based on pre- 
dicted coding sequences, primers were designed to 
amplify partial coding regions of the respective paralo- 
gues (Additional file 1). Total RNA from head-kidney, 
kidney, spleen and gills were pooled, reverse tran- 
scribed as above and used as PCR template. Following 
amplification by PCR, the products of interest were 
analysed using gel electrophoresis, purified, cloned and 
sequenced as described elsewhere [24]. The GeneRacer 
kit with Superscript III RT (Invitrogen, USA) was used 
to perform RACE PCR in order to obtain full length 
cDNA sequences. Outer and inner gene specific pri- 
mers for both 5' and 3' RACE were designed based on 
the partial sequences obtained above. RACE cDNA was 
synthesised as per the manufacturers protocol using 
total RNA pooled from head-kidney, kidney, spleen and 
gills. PCR products were cloned and sequenced using 
the primers listed in Additional file 1. 

Sequence analysis 

All sequences were analysed and assembled in Codon- 
Code Aligner v3.7.1 (www.codoncode.com/aligner) using 
default settings and their identity determined by 
BLASTN similarity searches against the NCBI non- 
redundant database. Nucleotide sequences were analysed 
for a Kozak consensus sequence to identify the start 
codon using ATGpr (atgpr.dbcls.jp) and the correspond- 
ing protein sequences were obtained using Translate 
(web.expasy.org/translate). Nucleotide data were submit- 
ted to Genbank under the accession numbers shown on 
Table 1. Cod tlr sequences and their teleost homologues 
(Additional file 2), as well as their corresponding protein 
sequences, were aligned with MatGat 2.02 (www.bitincka. 
com/ledion/matgat) using BLOSUM50 to generate iden- 
tity and similarity matrices. Protein domains were pre- 
dicted by ScanProsite (prosite.expasy.org/scanprosite) 
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Table 1 Teleost-specific tlrs of Atlantic cod 


Gene name 


Accession number 


Chromosomal Location 


Sequence length (bp) 


5'-UTR 


CDS 


3'-UTR 


Protein length (aa) 


tlr21 


JX074771 


GeneScaffold_1988 contig373731 


3047 


134 


2913 




970 


tlr22a 


JX074772 


GeneScaffoldJ 177 contig 165664 


1654 




1654 




551 


tlr22b 


JX074773 


GeneScaffoldJ 177 contig 165665 


3406 


262 


2829 


315 


942 


tlr22c 


JX074774 


GeneScaffoldJ 1 76 contig885687 


2408 


- 


2408 


- 


802 


tlr22d 


JX074775 


GeneScaffoldJ 1 76 contig 165725 


3252 


229 


2880 


143 


959 


tlr22e 


JX074776 


GeneScaffoldJ 177 contig885683 


1612 


252 


1360 


_ 


453 


tlr22f 


JX074777 


scaffold03378 contig96110 


2707 


232 


2475 


_ 


825 


tlr22g 


JX074778 


GeneScaffoldJ 685 contig343097 


3082 


272 


2529 


281 


842 


tlr22h 


JX074779 


scaffold00128 contig05698 


2847 


250 


2597 




865 


tlr22i 


JX074780 


contig536615 


3219 


250 


2865 


104 


954 


tlr22j 


JX074781 


contig520640 


2149 




2149 




716 


tlr22k 


JX074782 


GeneScaffold_351 contig605495 


384 




293 


91 


96 


tlr22l 


JX074783 


GeneScaffold_351 contig892392 


2706 




2523 


183 


840 


tlr23o 


JX074784 


scaffold12300 contig717163 


3427 


340 


2850 


237 


949 


tlr23b 


JX074785 


contig 12242 


2165 


131 


1737 


297 


578 



Full length sequences are represented in bold. 



and LRR motifs were mapped manually on the protein 
sequence based on the corresponding tiger pufferfish Tlrs 
[25]. Intron-exon boundaries were identified using the 
Ensembl cod genome sequence and Spidey (www.ncbi. 
nlm.nih.gov/spidey). Synteny analysis was performed 
manually based on the Ensembl assemblies of stickleback 
(v67.1), tiger pufferfish (v67.4), green-spotted pufferfish 
(v67.8), zebrafish (v67.9) and medaka, Oryzias latipes 
(v67.1). 

Phylogenetic inference 

A total of 41 sequences from 14 teleosts (Additional file 2) 
were used to perform the phylogenetic analysis to eluci- 
date the evolution of teleost tlrs. MUSCLE (www.ebi.ac. 
uk/Tools/msa/muscle) was used to align cDNA sequences 
and the best nucleotide substitution model was identified 
using MrModelTest v2.3 [26] and PAUP* v4.0bl0 [27], as 
reported [28]. The best model to describe the data was 
identified based on the Akaike information criterion 
(AIC). Maximum likelihood phylogenetic analysis was 
carried out with PhyML [29] and Bayesian inference was 
performed as detailed elsewhere [28]. The multiple se- 
quence alignment used for phylogenetic reconstruction 
and corresponding tree have been submitted to TreeBASE 
(www.treebase.org/) under the accession ID 13554. 

Quantification of gene expression 
Primer design 

Specific primers were designed to quantify the expres- 
sion of Atlantic cod tlr21, tlr22 and tlr23 paralogues 
using qualitative RT-PCR as well as real-time PCR 
(qPCR) (Table 2). In RT-PCR, eefla was used as an 



internal reference gene for tissue distribution analysis 
while luciferase was used as an external control to deter- 
mine expression across developmental stages, as it has 
been shown that expression of commonly used house- 
keeping genes is not stable during this period, especially 
if it encompasses the maternal-zygotic transition [22]. 
Eefla and ubi were used as reference genes for qPCR. 
Whenever possible, primers were designed across 
intron-exon boundaries and screened for hairpins, 
homo- and cross-dimers using Netprimer (www.pre- 
mierbiosoft.com/netprimer). 

Qualitative RT-PCR (RT-PCR) 

Gene expression across tissues and developmental stages 
for Atlantic cod tlr21, tlr22 and tlr23 was determined 
using RT-PCR. Recombinant Taq DNA polymerase 
(VWR, USA) was used for RT-PCR with the following 
thermocycling parameters: 95°C: 2 min, 35 cycles of (95°C: 
15 sec, annealing temperature (Table 1): 30 sec and 72°C: 
2 min) and 72°C: 7 min. Amplification was carried out in 
Bio-Rad C1000 thermocycler (Bio-Rad, USA). Samples 
were analysed by electrophoresis on 1.5% (w/v) gels and 
then visualised and photographed using the Kodak Gel 
Logic 200 Imaging System (Carestream, USA). 

Real-time PCR (qPCR) 

Quantification of gene expression was performed by real- 
time PCR with SYBR green chemistry on a LightCycler 
480 (Roche, USA), as detailed elsewhere [11]. A dissoci- 
ation step with a gradient from 65°C to 97°C was per- 
formed to check the specificity of the qPCR reaction and 
the absence of primer dimers. Specificity was further 
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Table 2 Primers used for semi quantitative (RT-PCR) and real-time PCR (qPCR) of teleost-specific tlrs in Atlantic cod 



Gene Name qPCR primer (Forward and Reverse) {5'-3 f ) Amplicon (bp) RT-PCR/qPCR annealing Efficiency (%) 

(°C) 



tlr21 


CGTOCAATCGCATCCTCTCAG GCTGCTCCACAACTCAGTCAAG 


177 


58/60 


110 


tlr22o 


GCAGGAAGTOTGGAGACAmATCATOACATOGAGCACAAGTG 


186 


58/60 


98 


tlr22b 


GAGTOGACmGGGACGAA ACATOCTGACGGCACAAG 


128 


58/60 


125 


tlr22c 


TCAGTOCCAATGCCGTAAG ACACAGTCCmAGAACCAAGACAC 


155 


58/62 


130 


tlr22d 


AGAGGAGGGTATGmGATGGCTGTOGCTAAGTOCGCAG^ 


152 


58/62 


116 


tlr22e 


CCAACCTCACAAGATOAACCTGCAAGCGACAACCACTGATA 


120 


58/60 


115 


tlr22f 


CGCTOGACCTGAGACACAAC^ AATCCATCAAACATACCCTCCTC 


131 


58/64 


91 


tlr22g 


GCAGCAAACGAGATGTCCACTCTCCCAGACGATACCATOTC 


178 


58/64 


116 


tlr22h 


GCTOGACCTGACACGCAACA AAGCCAGACGCAGTOAATG 


159 


58/62 


130 


tlr22i 


GCATCGGTAGAGCCTATOTGA GAAATOGTCCGCTOTGAGA 


102 


58/64 


111 


tlr22j 


TGTGATOGAGAACCAGTGATGCTTGTGTCTGCTOmGTGAmCC 


129 


58/62 


92 


tlr22k 


TCCTACAATGGCAACTGGTCTAC CCCAGCCCTCGTCGmG 


129 


58/60 


88 


tlr22l 


CTCTOGGCTGCTOACACmAATCTGGATAGATAGATAACGCTGAGACG 


171 


58/60 


104 


tlr23a 


CC^CGGCTACCACITCCTG GCCTCGCTCGTCCTCCA 


188 


58/62 


110 


tlr23b 


GACTCCAAmCCTCTGCTOA GGTGCTGCTCATOTOTOCT 


163 


58/64 


94 


luciferase 


TCATOTOGCCAAAAGCACTCTG AGCCCATATCC1TGTCGTATCCC 


149 


58/58 


98 


eeflo 


CACTGAGGTGAAGTCCGTO GGGGTCGTOTOCTGTCT 


142 


58/58 


110 


ubi 


GGCCGCAAAGATGCAGAT CTGGGCTCGACCTCAAGAGT 


69 


69/60 


92 



confirmed by Sanger sequencing of qPCR products. C T 
values were calculated with a fluorescence threshold of 0.5 
and the average of two technical replicates was used to 
calculate relative gene expression. Data were normalised 
against eefla and ubi expression using geometric normal- 
isation factors obtained from GeNorm (http://medgen. 
ugent.be/genorm/), as previously described [30]. Relative 
gene expression against the initial control sample was 
determined and statistical analysis was performed by one- 
way ANOVA with Tukeys HSD post-hoc tests using the 
SigmaPlot 12.0 (Systat Software Inc., USA). When the data 
did not meet normality or equal variance requirements, a 
Kruskal-Wallis one-way ANOVA by ranks and median 
tests was performed. Significance levels were set at P < 
0.05. The sample size was too small to exclude a tank ef- 
fect but there was no obvious pattern of differential gene 
expression in one particular tank. 

Tests of selection pressure and divergence 

All complete and partial tlr22 paralogues were used for 
selection pressure analysis, except tlr22a, tlr22e and 
tlr22k, since these genes had only partial sequences of 
1654, 1360 and 293 bp, respectively. Coding sequences 
of the other nine tlr22 paralogues were aligned with 
MUSCLE and a codon alignment was performed using 
the Codon Align software (www.hiv.lanl.gov). The N- 
terminal portion of the codon aligned sequences was too 
variable and hence 210 bp of this region were removed 



prior to positive selection tests. Similarly, the C-terminal 
region coding for TIR domain was not included in the 
analysis, as it is highly conserved across all known trans- 
membrane TLRs. Instead, a codon alignment comprising 
75% (2169 bp) of the total CDS and without stop codons 
was used. The best nucleotide substitution model was 
selected using MrModelTest v2.3 [26] and PAUP* v4.0bl0 
[27] based on AIC. Differences in sequence diversity 
between the regions that code for different domain struc- 
tures were examined by calculating the average number of 
synonymous (dS) and non-synonymous (dN) substitu- 
tions, insertions and deletions in the codon alignments 
using SNAP [31]. 

Codon based Z-tests of selection were performed to test 
the hypothesis of positive selection in MEGA4 [32] using 
the modified Nei-Gojobori method (Jukes-Cantor) and 
calculating the variance with 1000 bootstrap replicates 
[33]. Evolutionary distances between the nine Tlr22 para- 
logues were estimated by Tajimas relative rate test [34]. 
Each pair of paralogues was compared with Tlr22b as out- 
group, since it was the most distant Tlr22 paralogue for 
which the complete sequence was available. In addition, 
tests for positive selection were performed using the max- 
imum likelihood methods implemented in the CODEML 
program of PAML, as detailed elsewhere [35]. The dN/dS 
ratio (co) was calculated using models M0 (neutral), Ml 
(nearly neutral), M2 (positive selection), M7 (beta) and 
M8 (beta & co). Models were compared against each other 
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using likelihood ratio tests (LRTs). Bayesian posterior 
probabilities (p) were calculated for positively selected sites 
using naive empirical Bayes (NEB) and Bayes empirical 
Bayes (BEB). REL, FEL and SLAC analyses were carried 
out in Datamonkey (www.datamonkey.org) to calculate co 
values for each codon, along with the corresponding prob- 
ability values [36]. 

Results 

Expanded teleost-specific tlrs in Atlantic cod 

Homology searches for tlr21, tlr22 and tlr23 paralogues 
in the cod genome assembly identified 15 open reading 
frames that encode proteins with homology to these 
teleost-specific tlrs. In silico gene prediction analysis 
confirmed the presence of one tlr21, 12 tlr22 paralogues 
and two tlr23 paralogues, all encoding a typical Tlr pro- 
tein (Table 1). A partial tlr21 cDNA of 3047 bp was 
sequenced, including the 134 bp 5'-UTR and the 
2913 bp complete coding region corresponding to a 970 
aa protein. Cod Tlr21 shares more than 50% identity 
with its orthologues in zebrafish, tiger pufferfish and 
medaka, as well as with Tlr21a and Tlr21b of orange- 
spotted grouper (Epinephelus coioides). Based on the 
genome assembly, the tlr21 partial sequence was found 
to be encoded by a single exon (Figure 1). Full length 
cDNA sequences along with the 5'- and 3'-UTR regions 
were obtained for four tlr22 paralogues. Tlr22b, tlr22d, 
tlr22g and tlr22i were 3406, 3252, 3082 and 3219 bp 
long and encoding 942, 959, 842 and 954 aa proteins, re- 
spectively. They were composed of five, three, three and 
three exons, respectively (Figure 1). At the protein level, 
they are 62 to 75% identical to each other and share up 
to 73% similarity with other teleost Tlr22 proteins. Par- 
tial coding sequences for seven of the tlr22 paralogues 
were obtained either with or without the UTR regions 
from a minimum length of 1612 bp up to 2847 bp, en- 
coding partial proteins of 453 aa to 865 aa. In the case 



of tlr22k, it was only possible to obtain a short sequence 
of 384 bp, including the 3'-UTR and coding for a 96 aa 
partial protein (Table 1). Complete cDNA sequences were 
determined for both tlr23 paralogues in cod. Tlr23a was 
3427 bp while tlr23b was only 2165 bp. Tlr23a and tlr23b 
were encoded by 5 and 3 exons, respectively (Figure 1), 
corresponding to proteins of 949 and 578 aa, respectively. 
At the nucleotide level, tlr23a and tlr23b were 45% identi- 
cal to each other and shared 47% identity at the protein 
level with tiger pufferfish and green-spotted pufferfish 
Tlr23. 

In general, all tlrs analysed in this study had an N- 
terminal LRR domain, a transmembrane domain and a 
C-terminal TIR signalling domain (Figure 2). Leucine 
rich repeats (LRRs) were mapped manually and the LRR 
C-terminal (LRRCT) domain was also identified. Tlr21 
contained 27 LRRs and a typical CxCx 2 4Cxi 5 C motif in 
its LRRCT domain. Full length cDNAs from tlr22b, 
tlr22d, tlr22g and tlr22i encoded for 27 LRRs and had a 
CxCx 2 4Cx 18 C motif at its LRRCT domain. Tlr23a and 
Tlr23b had CxCx 24 Cx 18 C at their LRRCT domain with 
27 and 14 LRRs, respectively. 

Synteny and phylogenetic analysis of teleost-specific tlrs 
in cod 

Most cod tlrs were mapped to single contigs (Table 1). 
Tlr22a, tlr22b and tlr22e were present in the same 
chromosomal region (GeneScaffold_1177), which was syn- 
tenic in stickleback, tiger pufferfish and green-spotted puf- 
ferfish tlr22 (Figure 3). Tlr22c and tlr22d were found in 
GeneScaffold_1176 and tlr22k and tlr22l were both in 
GeneScaffold_351 along with other genes, but there was 
no identifiable synteny in these regions across other tele- 
ost genomes (Figure 3). 

Bayesian inference from 41 tlr21, tlr22 and tlr23 
sequences from 15 teleost species generated a consensus 
phylogenetic tree that was identical to the maximum 
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Figure 1 Gene structure of teleost-specific tlrs in Atlantic cod. Graphical representation of Atlantic cod tlr21, tlr22 and tlr23 gene structures. 
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likelihood one (Figure 4). All tlr21 genes were grouped 
under a single clade, while tlr22 and tlr23 formed a sep- 
arate cluster. Stickleback tlr21a clustered with other 
teleost tlr21 genes, while tlr21b seemed to have arisen 
from a recent duplication and was more closely related 
to teleost tlr22. It is noteworthy that all tlr22 from cod 
clustered under a single clade, while the two tlr23 para- 
logues clustered along with their homologues from Tet- 
raodontidae. As expected, the tlr22 paralogues encoded 
by salmonids, such as Atlantic salmon and rainbow trout, 
were grouped together and corresponded to closely 
related paralogues, which have probably arisen from the 
salmonid tetraploidisation. Amongst the cod tlr22 paralo- 
gues that are adjacent in the genome (Figure 3), only 
tlr22k and tlr22l clustered together, whereas tlr22a, tlr22b 
and tlr22e or tlr22c and tlr22d did not. Tlr22 encoded by 
basal teleosts belonging to the Ostariophysi superorder 
clustered as a separate clade followed by Salmonidae and 
higher teleosts from the Acanthopterygii superorder. 
Unexpectedly, cod tlr22 paralogues were more distant from 
the ancestral tlr22 sequence than their Acanthopterygii 
orthologues. 

Expression profiles of teleost-specific tlrs in adult cod 
tissues and during early ontogeny 

Tlr21, tlr22 and tlr23 paralogues were widely expressed 
across many tissues, including immune-related organs 
(head-kidney, kidney spleen and gills), liver and gonads 
(Figure 5A). All tissues examined, except ovary, had de- 
tectable levels of tlr21 transcripts with high levels in 



kidney, liver, gills, testis and blood. A differential expres- 
sion pattern across adult fish tissues was observed for 
tlr22 paralogues. Tlr22k transcripts were detected in all 
tested tissues. Tlr22e had the lowest expression in kidney, 
liver and gills, while it was not detected in other tissues. 
All tlr22 paralogues, except tlr22e, were detected in head- 
kidney, kidney, spleen, liver and gills at varied levels. Six 
out of 12 tlr22 paralogues, tlr22a, tlr22c, tlr22d, tlr22h, 
tlr22j and tlr22k, were found to be expressed in stomach, 
while muscle and skin expressed only tlr22k. Testis had 
transcripts of most tlr22 paralogues but tlr22a, tlr22h and 
tlr22k were the only genes to be detected in ovary. Within 
tlr23 paralogues, expression of tlr23b was lower than that 
of tlr23a but they were both expressed in head-kidney, 
kidney, spleen, gills, blood and testis. Tlr23a transcripts 
were also found in liver, heart and brain. 

Tlr22c, tlr22h, tlr22j and tlr22k transcripts were found 
in unfertilised eggs (Figure 5B). Tlr22k was the only 
tlr22 paralogue to be expressed throughout early devel- 
opment and its transcripts were detected at epiboly, so- 
mite stage, golden eye, hatching, bladder and hindgut 
stages. Low expression of tlr21 and tlr22a was detected 
at later stages from hatching until first feeding, while 
tlr23a and tlr23b were not present in any of the develop- 
mental stages examined. 

Differential expression following pathogen challenge 

Teleost-specific tlrs in cod were differentially regulated 
following a bath challenge with V. anguillarum (Figure 6). 
A significant decrease of tlr21 expression was recorded 
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after 48 h in gills (2.3-fold) and spleen (2.2-fold) compared 
to the initial control. In head-kidney, the highest change 
in expression was observed at 4 hpc in tlr22c (3.3-fold 
decrease) and tlr22l (4.2-fold increase), albeit not signifi- 
cant, while most of the other paralogues remained at basal 
levels. Following a 2-fold significant decrease in expression 
of tlr22a and tlr22b at 4 hpc in head-kidney, tlr22a tran- 
scripts reached a 2-fold higher expression at 48 hpc, which 
was also significant compared to the initial control levels. 
Several significant changes in expression of tlr22 paralo- 
gues were also observed in gills and spleen following the 
bath challenge. In gills, tlr22d transcript levels were sig- 
nificantly reduced by 3.5-fold and this level was main- 
tained through to 48 hpc. In the same tissue, a decrease of 
up to 2-fold in tlr22h expression was observed at 4 and 48 
hpc. A significant decrease in tlr22f and tlr22i transcript 
levels was also observed at 48 hpc in gills. In spleen, tlr22d 
(2.4-fold), tlr22h (2.4-fold) and tlr22h (1.2-fold) were 
down-regulated at 4 hpc and an increase in expression of 
tlr22f, tlr22h and tlr22h (2.1-fold) was observed at 48 hpc 



compared to the initial control. Both tlr23a and tlr23b 
followed a similar pattern with significant reduction in 
the expression of tlr23a in gills (2.8-fold) and spleen (2.3- 
fold). 



Response to temperature stress 

Following thermal shock, a significant down-regulation 
of tlr21 and tlr22 paralogues was observed both in 
head-kidney and spleen, and most of the transcripts 
returned to initial levels or were up-regulated at 72 
hps (Figure 7). In both organs, up to 3-fold significant 
reduction in tlr21, tlr22f, tlr22g, tlr22i and tlr22k 
mRNA levels was observed at 4 hps. Tlr22a transcript 
levels did not show much change to stress, but had a 
3.1 -fold increase at 72 hps in head-kidney. Tlr22l ex- 
pression in head-kidney increased by 3-fold following 
thermal stress and then up to 4-fold at 72 hps, albeit 
not significant. The highest change in transcript levels 
was recorded for tlr22d, with a 5.5-fold decrease in 
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Figure 4 Phylogeny of teleost-specific f/rs. Unrooted 
phylogenetic tree of teleost-specific f/rs - tlr21, tlr22 and tlr23. 
Numbers at the nodes indicate posterior probability values from 
Bayesian inference. Posterior probability values were calculated for 
each node by Bayesian analysis based on 250,000 generations. 
Samples were collected every 100 generation and a consensus tree 
was built after burning the initial 1,250 trees. Only probability values 
above 0.8 are indicated: 0.95 to 1 shaded in red, 0.9 to 0.94 in blue 
and 0.8 to 0.89 in green, respectively. Atlantic cod genes are 
highlighted within red boxes. 



spleen at 4 hps. No significant change was observed in 
tlr23 expression with temperature stress. 

Molecular evolution of the cod tlr22 family 
Tests of selection and relative rate tests 

A pairwise codon based Z-test revealed that cod tlr22 
paralogues are evolving at different rates (Table 3). The 
highest dN-dS values were observed between tlr22c and 
tlr22i (2.852, P = 0.003) or tlr22l (2.787, P = 0.003). Even 
tlr22c and tlr22d, which are encoded by adjacent genes 



in the cod genome, were found to be evolving at different 
rates (dN-dS = 2.157, P = 0.016). Tajimas relative rate test 
further confirmed the evolution of cod Tlr22 paralogues 
through pairwise comparison of these protein sequences 
with Tlr22b as outgroup. The test revealed that Tlr22d 
has undergone relatively high divergence compared to all 
other Tlr22 paralogues (Additional file 3). 

Positive selection 

A sliding window analysis of the complete coding se- 
quence of nine tlr22 paralogues performed with SNAP 
revealed that the occurrence of non-synonymous muta- 
tions is not uniform throughout the coding sequence 
(Figure 8A). The average dN/dS ratio for the complete 
coding sequence was 0.748 (dS = 0.223, dN = 0.167), while 
the ratio for the LRR region was much higher (dN/dS = 
0.815) than for the TIR region (dN/dS = 0.313). These dif- 
ferences in substitution rates confirm that the TIR domain 
within teleost-specific Tlrs in cod is more conserved than 
the LRR region. Thus, the site-specific positive selection 
analysis focused on the latter. Likelihood ratio tests (LRTs) 
revealed that PAML models that allowed for adaptive 
positive selection fitted the data better than those which 
did not (M3 versus M0, p = 0; M2 versus Ml, p = 0; M8 
versus M7, p = 0) (Table 4). In total, 24 positively selected 
codons (PSCs) were identified by all three models, M2, 
M3 and M8, with co values of 4.08, 4.36 and 4.06, respect- 
ively. SLAC and FEL analyses found 2 and 28 codons 
evolving under positive selection with p-value less than 
0.1 (data not shown) and REL identified 19 sites PSCs with 
Bayes factor greater than 50 (Table 4). In total, the Data- 
monkey server analysis indicated 37 codons to be under 
selection pressure. The 24 sites indicated by the Bayesian 
approach using PAML were also selected by Datamonkey. 
All codons under positive selection were found 
within the N-terminal LRR domain, which recognises 
pathogens and 19 of these sites were present on the 
convex surface (Figure 8B, 8C). Fifteen of the 24 
PSCs were found within the LRR repeats. Only five 
of the 24 sites were found in beta sheets within the 
concave surface of the horseshoe-shaped domain, 
while most of the amino acids under selection pres- 
sure were on the structural components of the LRRs, 
the coils. 

Discussion 

We have characterised the full-repertoire of the highly 
expanded teleost-specific tlr family in Atlantic cod, 
which includes one tlr21, twelve tlr22 and two tlr23 
genes encoded by its genome. Phylogenetic analysis of 
tlr paralogues from 15 teleost species recovered mono- 
phyly of all tlr22 paralogues, suggesting their origin from 
a common teleost ancestor. All cod tlr22 paralogues 
were grouped under a single clade, which indicates that 
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Figure 5 Expression profile of cod teleost-specific f/rs in adult tissues and during early development. A. Tissue specific expression of 
Atlantic cod tlr21, tlr22 and tlr23 genes. Tlrs are mainly expressed in immune-related tissues such as head-kidney, kidney, spleen, liver and gills. 
Transcripts of most paralogues were also found in high levels in blood and testis. Eeflo was used as an internal reference for RT-PCR. Minus 
reverse transcriptase (-RT) and no template (NTC) controls were included to ascertain the specificity of PCR primers. Amplicon sizes in bp are 
indicated on the right hand side of the figure. B. Expression analysis of f/rs during embryonic development. Low expression of tlr21 was detected 
at later stages from hatching until first feeding, while tlr23o and tlr23b were not detected at any of the examined developmental stages. Tlr22c, 
tlr22, tlr22j and tlr22k transcripts were found in unfertilised eggs (UFE), while tlr22k was expressed at most developmental stages examined. 
Luciferase was used as an external reference for RT-PCR. 



they have likely arisen through tandem duplications. 
Cod is the first sequenced vertebrate identified to have 
lost all the mammalian cell surface and bacterial recog- 
nising TLR orthologues [14]. Based on the knowledge of 
the functional coverage of the vertebrate TLRs, 10 TLRs 
are predicted to be present in the common vertebrate 
ancestor, namely, TLR2, 3, 4, 5, 7, 8, 9, 11, 21 and 22 
[37]. Genes encoding Tlr2, Tlr4, Tlr5 and Tlrll are ab- 
sent, while Tlr3, Tlr7-9 are intracellular Tlrs. Hence, 
Tlr21 and Tlr22 are the only plausible cell surface Tlrs 
encoded by the cod genome. 

Partial synteny analysis based on the current genome 
build revealed conservation between Tlr22 encoding 
genes in cod and those in stickleback, tiger pufferfish 
and green-spotted pufferfish, within the genomic region 
containing sh3kbpl and map3J<15 genes. Sh3kbpl (SH3- 
domain kinase binding protein 1) is an adapter protein 
involved in regulating diverse signal transduction path- 
ways, while Map3kl5 (mitogen-activated protein kinase 
kinase kinase 15) plays a key role in signal transduction 
and is essential for stress-induced apoptosis [38]. Several 



tlr22 paralogues are in close proximity within the cod 
genome and seem to have arisen through tandem dupli- 
cations. There is no uniform exon-intron structure 
within these tlr22 paralogues. Full length CDS of cod 
tlr22b, tlr22d, tlr22g and tlr22i are encoded by 5, 3, 3 and 
3 exons, respectively. In the case of goldfish (Carassius 
auratus), zebrafish and rainbow trout (22 and 22 1) tlr22 
has a single exon, while the tiger pufferfish and large yel- 
low croaker orthologues are encoded by four, three and 
three exons, respectively [10,39,40]. Tlr22 genes in basal 
teleosts such as Cyprinidae and Salmonidae are repre- 
sented by a single exon, while their orthologues in higher 
teleosts (Sciaenidae, Tetraodontidae and Gadidae) con- 
tained multiple exons. This suggests that Tlr22 may have 
been encoded by an uninterrupted exon in the common 
vertebrate ancestor and has acquired additional introns 
during the evolution. According to homology, synteny and 
phylogenetic analyses, tlr22a encoded by a single exon 
(based on partial sequence) seems to be the ancestral 
Tlr22 encoding gene and the remaining eleven paralogues 
have arisen through tandem duplications. It was not 
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Figure 6 Quantification of teleost-specific Atlantic cod f/rs in 
response to bath challenge with V. anguillarum. Heatmap 
representing the expression of Atlantic cod f/rs in head-kidney, 
gills and spleen in response to bath challenge with V. 
anguillarum. After collecting initial control samples, fish were 
subjected to bath challenge with V. anguillarum strain H610 at a 
concentration of 2.6-1 0 7 cfu-ml" 1 . Samples were collected at 4 (4 
hpc) and 48 (48 hpc) h post-challenge. Relative expression of 
tlr21, tlr22 and tlr23 was determined by qPCR and expressed as 
ratios between each sample and the respective initial control. 
Significance levels were set at P < 0.05 and statistically different 
expression values are enclosed in red boxes. Eefla and ubi were 
used as internal controls. 



possible to perform a synteny analysis for tlr21 and tlr23 
paralogues, since the corresponding genomic scaffolds 
were short and did not contain more than two genes. Cod 
tlr21 is represented by an uninterrupted exon in the gen- 
ome, sharing this gene structure with zebrafish, tiger 
pufferfish and stickleback (tlr21a) homologues, while 
stickleback (tlr21b), channel catfish (Ictalurus punctatus) 
and medaka homologues are encoded by multiple exons. 
Tlr23 has been identified in two more teleosts, tiger puf- 
ferfish and green-spotted pufferfish, both comprising three 
exons each, while cod tlr23a had five exons and tlr23b is 
composed of three exons in its genome. Completion of 
current genome build will provide a better understanding 
of the origin of the various paralogues as well as synteny 
with other teleosts. 

TLRs have cysteine clusters flanking either side of the 
LRR region with two to five cysteine residues, which are 
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Figure 7 Quantification of teleost-specific Atlantic cod f/rs in 
response to temperature stress. Heatmap representing the 
expression of Atlantic cod tlrs in head-kidney and spleen in response 
to temperature stress. Adult fish were maintained at 4°C After 
collecting initial control samples, the water temperature was 
gradually increased to 12°C in 4 h (4 hps) and the fish were 
maintained at this temperature for 72 h (72 hps). Relative expression 
of X\r2i, tlr22 and tlr23 paralogues was quantified by qPCR as ratios 
between each sample and the initial control. Significance levels 
were set at P < 0.05 and statistically different expression values are 
enclosed in red boxes. Eefla and ubi were used as internal controls. 



denoted LRRCT and LRRNT domains [25]. While LRRNT 
regions are variable among TLRs, LRRCT contains a 
highly conserved consensus sequence and is known to 
play a crucial role in TLR signalling. The LRRCT forms a 
compact structure stabilised by disulphide bridges posi- 
tioning the extracellular domain of the TLR relative to the 
membrane, as seen in the structure of human TLR3 
protein [41]. Similar to other known teleost Tlr21s, the 
Atlantic cod Tlr21 protein has a CxCx 24 Cx 15 C motif at 
its LRRCT domain, while Tlr22 and Tlr23 had a 
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Table 3 Codon based Z-test of positive selection analysis between Atlantic cod tlr22 paralogues 



Atlantic cod paralogues 


tlr22b 


tlr22c 


tlr22d 


tlr22f 


tlr22g 


tlr22h 


tlr22i 


tlr22j 


tlr22l 


tlr22b 




-0.436 


-1.789 


0.002 


0.241 


0.135 


0.833 


0.779 


0.449 


tlr22c 


1.000 




2.157 


1.554 


1.186 


2.072 


2.852 


2.656 


2.787 


tlr22d 


1.000 


0.016 




0.264 


1.265 


1.722 


2.465 


1.907 


1.577 


tlr22f 


0.499 


0.061 


0.396 




1.817 


1.968 


2.345 


2.020 


2.389 


tlr22g 


0.405 


0.119 


0.104 


0.036 




0.800 


2.314 


1.131 


2.126 


tlr22h 


0.446 


0.020 


0.044 


0.026 


0.213 




0.074 


-0.427 


0.306 


tlr22i 


0.203 


0.003 


0.008 


0.010 


0.011 


0.471 




1.632 


1.893 


tlr22j 


0.219 


0.004 


0.029 


0.023 


0.130 


1.000 


0.053 




0.901 


tlr22l 


0.327 


0.003 


0.059 


0.009 


0.018 


0.380 


0.030 


0.185 





A modified Nei-Gojobori method with Jukes-Cantor correction was used. The test statistic (dN-dS) is shown above the diagonal and the corresponding P-value is 
indicated below the diagonal. P-values less than 0.05 are highlighted in bold. Positions containing gaps were eliminated for this analysis and in total 708 codons 
were included in the final dataset. 



CxCx 24 Cx 18 C motif in their LRRCT, characteristic of tele- 
ost Tlr22 proteins [11,25]. The vertebrate TLR N- 
terminal ectodomain is made up of several LRRs and is 
involved in recognising PAMPs. The ectodomain of the 
teleost-specific Tlrs in cod is made of up to 27 LRR 
repeats [25]. Full length CDS of two tlr23 paralogues 
encoded for proteins containing 27 (Tlr23a) and 14 
(Tlr23b) LRRs within their N-terminal domain. It is note- 
worthy that cod Tlr23b contains such a low number of 
LRRs, since vertebrate Tlrs contain 16 to 28 LRRs. As 
cod is the first vertebrate known to encode for two tlr23 
paralogues, it is likely that tlr23a is the ancestral gene 
and Tlr23b has lost LRRs during evolution. Homology 
modelling of Tlr22b based on human TLR3 ectodomain 
(PDB ID: 2A0Z) [41] revealed a characteristic horseshoe- 
shaped structure. The human TLR3 ectodomain is com- 
posed of 23 LRRs forming the classical horseshoe-shaped 
structure and the concave inner surface is composed of 
21 parallel beta sheets with the hydrophobic residues 
pointing inwards forming a hydrophobic core. In the cod 
Tlr22b model, LRR22 formed an external protrusion 
similar to human LRR20. LRR11 formed a very large 
regular alpha helix and protruded outwards similar to 
human LRR12. These two LRRs may be involved in the 
recognition of PAMPs as observed for LRR 12 and LRR20 
in human TLR3. 

A similar pattern of cod tlr21, tlr22 and tlr23 expression 
was observed in zebrafish tlr21 and tlr22 [11], channel cat- 
fish tlr21 [42], rainbow trout tlr22 and tlr22l [12], large 
yellow croaker (Larimichthys crocea) tlr22 [39], grass carp 
{Ctenopharyngodon idella) tlr22 [40], goldfish tlr22 [13] 
and orange-spotted grouper tlr21 [43]. Nevertheless, the 
differential expression pattern observed across tissues for 
tlr22 paralogues indicates that this gene may have diversi- 
fied to attain specific roles in different tissues of cod. To 
date, cod and grass carp [40] are the only two teleosts 
known to express a tlr22 paralogue in fast muscle. Cod 
tlr22k was expressed in skin, similarly to channel catfish 



tlr22 [42]. The skin is an important mucosal defence 
organ [28] and the presence of tlr22k transcripts may 
trigger the innate immune response by detecting PAMPs, 
once they cross the mucosal layer into the skin. Cod 
testis expressed most tlrs, similarly to zebrafish tlr22 
[11]. Several mammalian TLRs in mouse are reported to 
be involved in the testicular innate immune response es- 
pecially in Sertoli cells [44]. Thus, tlr22 expression in 
testis suggests that it may be involved in protecting the 
male reproductive tract in cod and other teleosts. 
Teleost-specific tlrs showed varied developmental expres- 
sion patterns and unfertilised eggs had tlr22c, tlr22h and 
tlr22k transcripts, possibly derived from maternal source. 
Historically, Drosophila toll was identified as a key player 
in specification of the dorso-ventral axis during embry- 
onic development and several toll genes were found to 
be expressed throughout the developmental stages [45]. 
The main focus of mammalian TLR research is on the 
immune function of the gene and less evidence of their 
role in embryogenesis is established in vertebrates. A 
recent study on mouse brain has identified specific ex- 
pression patterns of TLR7 and TLR9 expression in devel- 
oping brain, which has been linked to the development 
of the central nervous system of vertebrates [46]. In grass 
carp, tlr22 transcripts were also found during late deve- 
lopmental stages [47]. This study corroborates our data, 
suggesting that teleost-specific tlrs may also play a role in 
embryogenesis. 

Tiger pufferfish Tlr22 was originally thought to be a 
functional substitute of human TLR3, as it responds to 
dsRNA and may therefore promote antiviral protection 
in teleosts [10]. Several in vivo and in vitro studies have 
shown that teleost-specific tlrs do respond to a wide var- 
iety of PAMPs originating from bacteria and parasites 
[4]. An increase in expression of tlr22 was found in LPS 
stimulated macrophages as well as in LPS, Aeromonas 
salmonicida or Mycobacterium cheloni stimulated leuco- 
cytes in goldfish [12]. LPS, peptidoglycan and poly(I:C) 
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Figure 8 Codons under positive selection in Atlantic cod Tlr22 paralogues and their location within Tlr22b. A. Cumulative non- 
synonymous (green) and synonymous (red) substitutions for all pairwise comparisons between nine Atlantic cod tlr22 paralogues. The ratio of 
non-synonymous (dN) over synonymous (dS) substitution is greater in the LRR region than in the TIR domain. B. Multiple sequence alignment of 
cod Tlr22. Amino acid residues identical to Atlantic cod Tlr22b are represented by a dot and alignment gaps are indicated by a dash. LRR regions 
are shaded in grey and positively selected sites are boxed in red. The cysteine cluster within the LRRCT domain is marked in green. C. Predicted 
structure of Atlantic cod Tlr22b. LRR region with the positively selected sites highlighted in black. Their amino acid position is indicated by 
arrows. 
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Table 4 Identification of positively selected sites in Atlantic cod tlr22 paralogues by maximum likelihood analysis 
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Only positively selected sites with Bayesian posterior probabilities above 95% are indicated and the ones greater than 99% are highlighted in bold. 
In the REL analysis, positively selected sites with a Bayes factor greater than 50 are highlighted in bold. 



as well as M. marinum up-regulated expression of 
tlr22 in larvae and adult zebrafish, respectively [11,48]. 
Continuous exposure of rainbow trout PBL, spleen 
and kidney to inactivated A. salmonicida, induced up 
to 8-fold increase in expression of tlr22 and tlr22l 
after 24 h. Following stimulation with poly(I:C), high 
levels of tlr22 transcripts in spleen of large yellow 
croaker [39]. A similar effect was seen on the expression 
of tlr22 in grass carp infected with reo virus [40]. In the 
present study, bath challenge with V. anguillarum 
induced a 2.1 -fold increase of tlr22f, tlr22h and tlr22k 
transcript levels in spleen at 48 hpc compared to the ini- 
tial control. In general, most of the genes analysed in 
this study responded to bacterial bath challenge across 
the three tissues that were examined, but with some 
tissue-specific responses. In particular, tlr22f and tlr22k 



were down- regulated in gills but up-regulated in spleen 
at 48 hpc. Our data revealed that in addition to recog- 
nising dsRNA, teleost-specific tlrs respond to PAMPs 
from bacterial origin. 

Tlr22d was down-regulated by 5.5-fold in spleen fol- 
lowing exposure to high temperature, indicating that it 
may be involved directly in the immune response to heat 
shock. Heat stress is known to induce an innate immune 
response by activating the overexpression of various 
heat shock proteins. In mammals, TLR2 and TLR4 up- 
regulation is mediated by p38 -kinase and might be 
involved in the enhanced response of PAMP in humans 
monocytes induced by head shock [49]. In the thermal 
shock experiment, heat shock protein 70 (hsp70) was up- 
regulated in skin by 3-fold at 72 hps (data not shown), 
thus confirming the effect of the temperature stress. 
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Human HSP70 released into the extracellular milieu binds 
to TLR2 and TLR4 and exerts immunoregulatory effects 
through its chaperokine activity [50]. HSP60 is also found 
to be dependent on TLR4 for induction of specific cyto- 
kines [51]. Most cod teleost-specific tlr genes were differ- 
entially regulated in this temperature stress experiment, 
indicating that they may be involved in regulating the heat 
shock as well as the immune response. 

In our previous study, we demonstrated that tlr22 genes 
from several teleost taxa (Cyprinidae, Gasterosteidae, 
Salmonidae, Adrianichthyidae, Tetraodontidae) are under 
adaptive selection pressure [11]. Pairwise comparison of 
N-terminal LRR domains amongst cod tlr22 paralogues 
showed that they are evolving at different rates. Signifi- 
cant dN-dS values greater than one were observed for 
most comparisons, the highest being between tlr22c and 
tlr22L Similar results were obtained from Tajimas relative 
rate test, which in fact revealed that tlr22d is evolving 
considerably faster than all other paralogues. This accele- 
rated divergence may account for its involvement in the 
heat shock response, since tlr22d was significantly down- 
regulated following thermal stress. 

Average non- synonymous nucleotide substitutions 
within Atlantic cod tlr22 paralogues were generally much 
higher than synonymous ones, especially within the LRR 
coding region. Five PSCs at positions 73, 170, 274, 369 
and 371 were found within the (3 sheets forming the 
hydrophobic core of the TLR ectodomain [52]. At posi- 
tions 73, 170 and 369, all nine cod Tlr22 paralogues pre- 
dominantly contained a hydrophilic residue, while 274 
and 371 were mostly hydrophobic in nature. PSCs 427, 
452, 455 and 509 are present after LRR19, LRR20, LRR20 
and LRR22, respectively, and are generally represented by 
different hydrophilic amino acids, where hydrophobic 
residues are normally found [53]. LRR11 and LRR22, 
which protrude outwards from the horseshoe-shaped 
domain that recognises PAMPs, have one and four PSCs, 
respectively. Ligand specificity may be based on variations 
in the amino acids in the solvent-exposed beta sheets or 
on variations in the convex surface of the horseshoe - 
shaped domain [53]. The substitution of hydrophilic 
amino acids for hydrophobic ones in and around the beta 
sheets will affect the polarity of the core. Also, changes i 
in other PSCs may be altering the polarity and the struc- 
ture of the ectodomain, thus producing striking variations 
in the PAMP recognising sites of Tlr22 paralogues in cod. 
Four PSCs were found to be unique to Tlr22d, which 
seems to be involved in the heat shock response. Unlike 
other Tlr 22 isoforms, the first three sites of Tlr22d had a 
negative charge (E318, E427 and E452) while the fourth 
was positive (H455). 

Positive selection within duplicate genes has been 
related to their functional diversification through neo- 
functionalisation [54,55]. Our study revealed that several 



PSCs in cod tlr22 genes may produce striking changes in 
critical protein sites and may therefore be associated with 
adaptation to evolving pathogens or acquisition of add- 
itional functions, such as the heat shock response. Hence, 
it is likely that these duplicate tlr22 genes are undergoing 
neofunctionalisation. Taken together with the observed 
asymmetric evolution rates amongst cod tlr22 paralogues, 
our data favour the adaptation model, as opposed to the 
Dykhuizen-Hartl model (reviewed in [19]). 

Conclusion 

We have identified and annotated 15 tlr genes represent- 
ing all the members of the highly expanded teleost- 
specific Tlr family in Atlantic cod, which includes 12 
tlr22 paralogues. They seem to have evolved through 
lineage-specific tandem duplications, perhaps to com- 
pensate for the absence of bacterial recognising and 
other cell surface Tlrs. The various tlr22 paralogues are 
evolving at different molecular rates and several codons 
in the region coding for their ligand binding domain are 
under adaptive selection, which may contribute to their 
functional diversification through neofunctionalisation. 
This conclusion is corroborated by experimental evidence 
of differential expression upon thermal shock and bacterial 
challenge. 
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